IEICE global.ieice.org Site

Keyword Search Result

[Keyword] motion estimation(131hit)

21-40hit(131hit)

H.264 Fast Inter-Mode Selection Based on Coded Block Patterns
Shih-Hsuan YANG Bo-Yuan CHEN Kuo-Hsin WANG

LETTER-Image Processing and Video Processing

Vol:
E92-D No:6
Page(s):
1324-1327
A new H.264 fast inter-mode decision algorithm based on coded block patterns is presented. Compared to the exhaustive mode search, the proposed method achieves an average 57% reduction in computation time with negligible degradation in visual quality. The speed and rate-distortion performance are comparable to those of known fast algorithms that involve more elaborate mechanisms.
An Ultra-Low Bandwidth Design Method for MPEG-2 to H.264/AVC Transcoding
Xianghui WEI Takeshi IKENAGA Satoshi GOTO

PAPER

Vol:
E92-A No:4
Page(s):
1072-1079
Motion estimation (ME) is a computation- and data-intensive module in video coding systems. Search window reuse methods play a critical role in bandwidth reduction by exploiting the data locality in video coding systems. In this paper, a search window reuse method (Level C+) is proposed for MPEG-2 to H.264/AVC transcoding. The proposed method is designed for ultra-low bandwidth applications in which on-chip memory is not the main constraining factor. By loading the search window for the motion estimation unit (MEU) and applying motion vector clipping, each MB in the MEU can utilize both horizontal and vertical search reuse. A very low bandwidth level (Rα<2) can be achieved with an acceptable on-chip memory.
A Fast Block Matching Algorithm Based on Motion Vector Correlation and Integral Projections
Mohamed GHONEIM Norimichi TSUMURA Toshiya NAKAGUCHI Takashi YAHAGI Yoichi MIYAKE

PAPER-Image Processing and Video Processing

Vol:
E92-D No:2
Page(s):
310-318
The block-based motion estimation technique is adopted by various video coding standards to reduce the temporal redundancy in video sequences. The core of that technique is the search algorithm used to find the location of the best-matched block. The full search algorithm is the most straightforward and optimal search algorithm, but it is computationally demanding. Consequently, many fast and suboptimal search algorithms have been proposed. They decrease the computational load of full search by reducing the number of locations searched. In this paper, a hybridization of an adaptive search algorithm and the full search algorithm is proposed. The adaptive search algorithm benefits from the correlation among spatially and temporally adjacent blocks. At the same time, a feature-domain matching criterion is used to reduce the complexity of the conventional pixel-based criteria. It is shown that the proposed algorithm produces good quality performance and requires less computational time than popular block matching algorithms.
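For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of full-search block matching with the sum of absolute differences (SAD) as the pixel-based criterion. It is not the authors' hybrid algorithm. The function names, the list-of-lists frame layout, and the `search_range` parameter are assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy). Frames are lists of rows (assumed layout)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement in [-search_range, search_range]^2 that keeps
    the candidate block inside the reference frame; return (best_mv, best_sad)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue  # candidate block would fall outside the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The nested loops visit (2 * search_range + 1)^2 candidates per block, which is the cost that fast search algorithms try to reduce.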
Sub-Pixel Motion Estimation Scheme Using Selective Interpolation
Junsang CHO Gwanggil JEON Jungwook SUH Jechang JEONG

LETTER-Multimedia Systems for Communications

Vol:
E91-B No:12
Page(s):
4078-4080
Current sub-pixel motion estimation algorithms are time- and memory-consuming when performing image compression and communication. We therefore propose a selective interpolation method for sub-pixel motion estimation. Selective interpolation is applied after a candidate sub-pixel-accuracy motion vector has been estimated from the simplest mathematical model. According to simulation results, the proposed method attains nearly the same performance as the full search for half-pixel motion estimation with much lower computational complexity.
Wide-Range Motion Estimation Architecture with Dual Search Windows for High Resolution Video Coding
Lan-Rong DUNG Meng-Chun LIN

PAPER-Embedded, Real-Time and Reconfigurable Systems

Vol:
E91-A No:12
Page(s):
3638-3650
This paper presents a memory-efficient motion estimation (ME) technique for high-resolution video compression. The main objective is to reduce external memory access, especially when local memory resources are limited. Reducing memory access also saves power. The key to reducing memory accesses is a center-biased algorithm, which performs the motion vector (MV) search with the minimum search data. To account for data reusability, the proposed dual-search-windowing (DSW) approaches use the secondary window only when the search requires it. Doing so lightens the loading of search windows and hence reduces the required external memory bandwidth. The proposed techniques can save up to 81% of external memory bandwidth and require only 135 MBytes/sec, while the quality degradation is less than 0.2 dB for 720p HDTV clips coded at 8 Mbits/sec.
Edge Block Detection and Motion Vector Information Based Fast VBSME Algorithm
Qin LIU Yiqing HUANG Satoshi GOTO Takeshi IKENAGA

PAPER

Vol:
E91-A No:8
Page(s):
1935-1943
Compared with previous standards, H.264/AVC adopts variable block size motion estimation (VBSME) and multiple reference frames (MRF) to improve the video quality. The full search motion estimation algorithm (FS), which evaluates every search candidate in the search window for 7 block types with multiple reference frames, consumes massive computation power. Mathematical analysis reveals that the aliasing problem of the subsampling algorithm comes from high-frequency signal components. Moreover, high-frequency signal components are also the main reason the MRF algorithm is essential. A picture rich in texture must contain many high-frequency signals. Based on these mathematical investigations, two fast VBSME algorithms are proposed in this paper, namely an edge block detection based subsampling method and a motion vector based MRF early termination algorithm. Experiments show that strong correlation exists among the motion vectors of blocks belonging to the same macroblock. By exploiting this feature, a dynamic adjustment of the search ranges of integer motion estimation is proposed in this paper. Combining our proposed algorithms with UMHS saves 96-98% of Integer Motion Estimation (IME) time compared to the exhaustive search algorithm. The induced coding quality loss is less than 0.8% bitrate increase or 0.04 dB PSNR decline on average.
Content-Aware Fast Motion Estimation for H.264/AVC
Zhenyu LIU Satoshi GOTO Takeshi IKENAGA

PAPER

Vol:
E91-A No:8
Page(s):
1944-1952
The key to high performance in video coding lies in efficiently reducing the temporal redundancies. For this purpose, the H.264/AVC coding standard has adopted variable block size motion estimation on multiple reference frames to improve the coding gain. However, the computational complexity of motion estimation also increases in proportion to the product of the reference frame number and the intermode number. The mathematical analysis in this paper reveals that the prediction errors mainly depend on the image edge gradient amplitude and the quantization parameter. Consequently, this paper proposes an image content based early termination algorithm, which outperforms the original method adopted by the JVT reference software, especially at high and moderate bit rates. In light of rate-distortion theory, this paper also relates the homogeneity of the image to the quantization parameter. For a homogeneous block, the search computation for futile reference frames and intermodes can be efficiently discarded. Therefore, the computation saving increases with the value of the quantization parameter. These content based fast algorithms were integrated with the Unsymmetrical-cross Multihexagon-grid Search (UMHexagonS) algorithm to demonstrate their performance. Compared to the original UMHexagonS fast matching algorithm, 26.14-54.97% of search time can be saved with an average of 0.0369 dB coding quality degradation.
Noise Robust Motion Refinement for Motion Compensated Noise Reduction
Jong-Sun KIM Lee-Sup KIM

LETTER-Image Processing and Video Processing

Vol:
E91-D No:5
Page(s):
1581-1583
A motion refinement algorithm is proposed to enhance motion compensated noise reduction (MCNR) efficiency. Instead of the vector with minimum distortion, the vector with minimum distance from motion vectors of neighboring blocks is selected as the best motion vector among vectors which have distortion values within the range set by noise level. This motion refinement finds more accurate motion vectors in the noisy sequences. The MCNR with the proposed algorithm maintains the details of an image sequence very well without blurring and joggling. And it achieves 10% bit-usage reduction or 0.5 dB objective quality enhancement in subsequent video coding.
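The selection rule described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the admissible range is taken as the minimum distortion plus a `noise_margin`, and the distance to the neighbors is the summed L1 distance. The abstract does not specify either choice, and all names here are hypothetical.

```python
def refine_motion_vector(candidates, neighbor_mvs, noise_margin):
    """candidates: list of ((dx, dy), distortion) pairs for one block.
    neighbor_mvs: motion vectors of neighboring blocks.
    noise_margin: assumed width of the distortion range set by the noise level."""
    min_distortion = min(d for _, d in candidates)
    # Keep every vector whose distortion lies within the noise-dependent range.
    admissible = [mv for mv, d in candidates if d <= min_distortion + noise_margin]

    def distance_to_neighbors(mv):
        # Summed L1 distance to the neighboring vectors (an assumed metric).
        return sum(abs(mv[0] - nx) + abs(mv[1] - ny) for nx, ny in neighbor_mvs)

    # Among admissible vectors, prefer the one closest to the neighbors.
    return min(admissible, key=distance_to_neighbors)
```

With `noise_margin` set to 0 the rule falls back to the plain minimum-distortion choice, which makes the effect of the noise-dependent range easy to compare.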
A 41 mW VGA@30 fps Quadtree Video Encoder for Video Surveillance Systems
Qin LIU Seiichiro HIRATSUKA Kazunori SHIMIZU Shinsuke USHIKI Satoshi GOTO Takeshi IKENAGA

PAPER

Vol:
E91-C No:4
Page(s):
449-456
Video surveillance systems have a huge market, as indicated by the number of installed cameras, particularly for low-power systems. In this paper, we propose a low-power quadtree video encoder for video surveillance systems. It features a low-complexity motion estimation algorithm, an application-specific ME-MC processor, a dedicated quadtree encoder engine and a processor control-based clock-gating technique. A chip capable of encoding 30 fps VGA (640×480) at 80 MHz is fabricated using 0.18 µm CMOS technology. A total of 153 K gates with 558 kbits SRAM have been integrated into a 5.0 mm × 3.5 mm die. The power consumption is 40.87 mW at 80 MHz for VGA at 30 fps and 1.97 mW at 3.3 MHz for QCIF at 15 fps.
A Sub 100 mW H.264 MP@L4.1 Integer-Pel Motion Estimation Processor Core for MBAFF Encoding with Reconfigurable Ring-Connected Systolic Array and Segmentation-Free, Rectangle-Access Search-Window Buffer
Yuichiro MURACHI Junichi MIYAKOSHI Masaki HAMAMOTO Takahiro IINUMA Tomokazu ISHIHARA Fang YIN Jangchung LEE Hiroshi KAWAGUCHI Masahiko YOSHIMOTO

PAPER

Vol:
E91-C No:4
Page(s):
465-478
We describe a sub-100-mW H.264 MP@L4.1 integer-pel motion estimation processor core for low-power video encoders. It supports macroblock adaptive frame field (MBAFF) encoding and bi-directional prediction for a resolution of 1920×1080 pixels at 30 fps. The proposed processor features a novel hierarchical algorithm, a reconfigurable ring-connected systolic array architecture and a segmentation-free, rectangle-access search window buffer. The hierarchical algorithm consists of a fine search and a coarse search. A complementary recursive cross search is newly introduced in the coarse search. The fine search is adaptively carried out, based on an image analysis result obtained by the coarse search. The proposed systolic array architecture minimizes the amount of transferred data, and lowers computation cycles for the coarse and fine searches. In addition, we propose a novel search window buffer SRAM that has instantaneous accessibility to a rectangular area at an arbitrary location. The processor core has been designed with a 90 nm CMOS design rule. Core size is 2.5×2.5 mm². One core supports one reference frame and dissipates 48 mW at 1 V. A two-core configuration consumes 96 mW for a two-reference-frame search.
Parallel Improved HDTV720p Targeted Propagate Partial SAD Architecture for Variable Block Size Motion Estimation in H.264/AVC
Yiqing HUANG Zhenyu LIU Yang SONG Satoshi GOTO Takeshi IKENAGA

PAPER

Vol:
E91-A No:4
Page(s):
987-997
A hardware-efficient, high-speed architecture for variable block size motion estimation (VBSME) in H.264 is presented in this paper. By improving the pipeline structure and processing element (PE) circuits, the system latency and hardware cost are reduced, which makes this structure more hardware efficient than the original Propagate Partial SAD architecture. For coding small and medium frame sizes, the proposed structure can save 12.1% of the hardware cost compared with the original Propagate Partial SAD structure. In the case of HDTV, since small inter modes contribute only trivially to the coding quality, we remove modes below 8×8 in our design. With this mode reduction technique, when the number of PE array sets is less than 8, the proposed mode reduction based Propagate Partial SAD structure can work at a faster clock speed and with less hardware cost than the widely used SAD Tree architecture. It is more robust to high-speed timing constraints when parallel processing is considered. With TSMC 0.18 µm technology under worst-case working conditions (1.62 V, 125°C), the peak throughput of the 8-set PE array structure is 720p@30 Hz with a 128×64 search range and 5 reference frames. Our design reduces the hardware cost by 12 k gates compared with the parallel SAD Tree architecture.
An Irregular Search Window Reuse Scheme for MPEG-2 to H.264 Transcoding
Xiang-Hui WEI Shen LI Yang SONG Satoshi GOTO

PAPER-Image Coding and Video Coding

Vol:
E91-A No:3
Page(s):
749-755
Motion estimation (ME) is a computation-intensive module in video coding systems. In MPEG-2 to H.264 transcoding, reusing the motion vector (MV) from MPEG-2 as the search center in the H.264 encoder is a simple but effective technique to simplify ME processing. However, directly applying the MPEG-2 MV as the search center makes it difficult to apply data reuse methods in hardware design, because of the irregular overlapping of search windows between successive macroblocks (MBs). In this paper, we propose a search window reuse scheme for transcoding, especially for HDTV applications. By utilizing the similarity between neighboring MVs, the overlapping area of search windows can be regularized. Experimental results show that our method achieves an average 93.1% search window reuse rate on HDTV720p sequences with almost no video quality degradation. Compared to a transcoding method without any data reuse scheme, the bandwidth of the proposed method is reduced to 40.6%.
High-Efficiency VLSI Architecture Design for Motion-Estimation in H.264/AVC
Chun-Lung HSU Mean-Hom HO

PAPER-System Level Design

Vol:
E90-A No:12
Page(s):
2818-2825
This paper proposes a flexible VLSI architecture design for motion estimation in H.264/AVC using a high-efficiency variable block-size decision (VBSD) approach. The proposed VBSD approach can perform a full motion search on integrating the 4×4 block sizes into 4×8, 8×4, 8×8, 8×16, 16×8, or 16×16 block sizes and then appropriately select the optimal modes for motion compensation operating. In other words, the proposed architecture based on the VBSD approach can effectively reduce the encoding time of the motion estimation by dealing with different block sizes under 16×16 searching range. Using the TSMC 0.18-µm CMOS technology, the proposed architecture has been successfully realized. Simulation and verification results show that the proposed architecture has significant bit-rate reduction and small PSNR degradation. Also, the physical chip design revealed that the maximum frame rate of this work can process 704 fps with QCIF (176×144), 176 fps with CIF (352×288) and 44 fps with 4CIF (704×576) video resolutions under lower gate counts and higher working frequency.
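As a quick arithmetic check on the frame rates quoted above, the sketch below converts each figure into 16×16 macroblocks per second. The observation that the three figures correspond to the same macroblock throughput is ours and is not stated in the abstract. The helper names are hypothetical.

```python
def macroblocks_per_frame(width, height, mb_size=16):
    """Number of mb_size x mb_size macroblocks in one frame."""
    return (width // mb_size) * (height // mb_size)


# (width, height, reported maximum frames per second) from the abstract.
REPORTED = {
    "QCIF": (176, 144, 704),
    "CIF": (352, 288, 176),
    "4CIF": (704, 576, 44),
}


def macroblock_rate(name):
    """Macroblocks processed per second at the reported frame rate."""
    width, height, fps = REPORTED[name]
    return macroblocks_per_frame(width, height) * fps


if __name__ == "__main__":
    for name in REPORTED:
        print(name, macroblock_rate(name))  # each line prints 69696
```

All three formats work out to 69,696 macroblocks per second (99 × 704, 396 × 176, and 1584 × 44), so the reported frame rates scale inversely with frame area.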
Binary Motion Estimation with Hybrid Distortion Measure
Jong-Sun KIM Lee-Sup KIM

LETTER-Image Processing and Video Processing

Vol:
E90-D No:9
Page(s):
1474-1477
This paper proposes a new binary motion estimation algorithm that improves the motion vector accuracy by using a hybrid distortion measure. Unlike conventional binary motion estimation algorithms, the proposed algorithm considers the sum of absolute difference (SAD) as well as the sum of bit-wise difference (SBD) as a block-matching criterion. In order to reduce the computational complexity and remove additional memory accesses, a new scheme is used for SAD calculation. This scheme uses 8-bit data of the lowest layer already moved into the local buffer to calculate the SAD of other higher binary layer. Experimental results show that the proposed algorithm finds more accurate motion vectors and removes the blockishness of the reconstructed video effectively. We applied this algorithm to existing video encoder and obtained noticeable visual quality enhancement.
Efficient Motion Estimation for H.264 Codec by Using Effective Scan Ordering
Jeongae PARK Misun YOON Hyunchul SHIN

LETTER-Devices/Circuits for Communications

Vol:
E90-B No:7
Page(s):
1839-1843
Motion estimation (ME) is a computation intensive procedure in H.264. In ME for variable block sizes, an effective scan ordering method has been devised for early termination of absolute difference computation when the termination does not affect the performance. The new ME circuit with effective scan ordering can reduce the amount of computation by 70% compared to JM8.2 and by 30% compared to the disable approximation unit (DAU) approach.
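Early termination of the absolute difference computation can be illustrated with the sketch below: accumulation stops as soon as the partial sum can no longer beat the best candidate found so far, so the final decision is unchanged. The plain row-by-row raster order used here is an assumption for illustration and is not the scan ordering devised in the paper. The function name is hypothetical.

```python
def sad_with_early_exit(block_a, block_b, best_so_far):
    """Accumulate absolute differences row by row (raster order, assumed).
    Return the full SAD, or None once the partial sum reaches `best_so_far`,
    since such a candidate cannot become the new best match."""
    total = 0
    for row_a, row_b in zip(block_a, block_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
        if total >= best_so_far:
            return None  # terminated early; remaining rows are skipped
    return total
```

The order in which pixels are visited determines how soon the threshold is crossed, which is why a scan ordering that meets large differences first can cut more computation.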
A Multiple Block-matching Step (MBS) Algorithm for H.26x/MPEG4 Motion Estimation and a Low-Power CMOS Absolute Differential Accumulator Circuit
Tadayoshi ENOMOTO Nobuaki KOBAYASHI Tomomi EI

PAPER-Digital

Vol:
E90-C No:4
Page(s):
718-726
To drastically reduce the power dissipation (P) of an absolute difference accumulation (ADA) circuit for H.26x/MPEG4 motion estimation, a fast block-matching (BM) algorithm called the Multiple Block-matching Step (MBS) algorithm has been developed. The MBS algorithm can drastically improve the block matching speed, while achieving the same visual quality as that of a full search (FS) BM algorithm. Power dissipation (P) of a 0.18-µm CMOS absolute difference accumulator (ADA) circuit employing the MBS algorithm is significantly reduced to the range of about 0.3% to 12% that of the same ADA circuit adopting FS.
Lossy Strict Multilevel Successive Elimination Algorithm for Fast Motion Estimation
Yang SONG Zhenyu LIU Takeshi IKENAGA Satoshi GOTO

PAPER

Vol:
E90-A No:4
Page(s):
764-770
This paper presents a simple and effective method to further reduce the search points in multilevel successive elimination algorithm (MSEA). Because the calculated sea values of those best matching search points are much smaller than the current minimum SAD, we can simply increase the calculated sea values to increase the elimination ratio without much affecting the coding quality. Compared with the original MSEA algorithm, the proposed strict MSEA algorithm (SMSEA) can provide average 6.52 times speedup. Compared with other lossy fast ME algorithms such as TSS and DS, the proposed SMSEA can maintain more stable image quality. In practice, the proposed technique can also be used in the fine granularity SEA (FGSEA) algorithm and the calculation process is almost the same.
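The elimination idea can be sketched with a single-level successive elimination test, which relies on the inequality |sum(current) - sum(candidate)| <= SAD(current, candidate). The sketch is not the multilevel algorithm of the paper. The `strictness` factor is a hypothetical stand-in for "increasing the calculated sea values": with the default of 1.0 the search is lossless, while larger values eliminate more candidates at the risk of missing the true best match.

```python
def block_sum(block):
    """Sum of all pixel values in a block (list of rows)."""
    return sum(sum(row) for row in block)


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def sea_search(cur_block, candidates, strictness=1.0):
    """Return (best_index, best_sad, evaluated), where `evaluated` counts the
    candidates whose full SAD was actually computed."""
    cur_sum = block_sum(cur_block)
    best_index, best_sad, evaluated = None, float("inf"), 0
    for index, cand in enumerate(candidates):
        # Lower bound on the SAD, optionally inflated (assumed lossy variant).
        bound = abs(cur_sum - block_sum(cand)) * strictness
        if bound >= best_sad:
            continue  # candidate cannot beat the current minimum; skip it
        evaluated += 1
        cost = sad(cur_block, cand)
        if cost < best_sad:
            best_index, best_sad = index, cost
    return best_index, best_sad, evaluated
```

In a real encoder the block sums would be precomputed for the whole search window, so that the elimination test costs far less than a full SAD.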
Parallel Adaptive Estimation of Hip Range of Motion for Total Hip Replacement Surgery
Yasuhiro KAWASAKI Fumihiko INO Yoshinobu SATO Shinichi TAMURA Kenichi HAGIHARA

PAPER-Parallel Image Processing

Vol:
E90-D No:1
Page(s):
30-39
This paper presents the design and implementation of a hip range of motion (ROM) estimation method that is capable of fine-grained estimation during total hip replacement (THR) surgery. Our method is based on two acceleration strategies: (1) adaptive mesh refinement (AMR) for complexity reduction and (2) parallelization for further acceleration. On the assumption that the hip ROM is a single closed region, the AMR strategy reduces the complexity for N×N×N stance configurations from O(N^3) to O(N^D), where 2≤D≤3 and D is a data-dependent value that can be approximated by 2 in most cases. The parallelization strategy employs the master-worker paradigm with multiple task queues, reducing synchronization between processors with load balancing. The experimental results indicate that the implementation on a cluster of 64 PCs completes estimation of 360×360×180 stance configurations in 20 seconds, playing a key role in selecting and aligning the optimal combination of artificial joint components during THR surgery.
Low-Power Partial Distortion Sorting Fast Motion Estimation Algorithms and VLSI Implementations
Yang SONG Zhenyu LIU Takeshi IKENAGA Satoshi GOTO

PAPER

Vol:
E90-D No:1
Page(s):
108-117
This paper presents two hardware-friendly, low-power oriented fast motion estimation (ME) algorithms and their VLSI implementations. The basic idea of the proposed partial distortion sorting (PDS) algorithm is to disable the search points which have larger partial distortions during the ME process, and only keep those search points with smaller ones. To further reduce the computation overhead, a simplified local PDS (LPDS) algorithm is also presented. Experiments show that the PDS and LPDS algorithms can provide almost the same image quality as full search with only 36.7% of the computational complexity. The two proposed algorithms can be integrated into different FSBMA architectures to save power. In this paper, the 1-D inter ME architecture [12] is used as a detailed example. Under the worst working conditions (1.62 V, 125°C) and a 166 MHz clock frequency, the PDS algorithm can reduce power consumption by 33.3% with 4.05 K gates of extra hardware cost, and the LPDS can reduce power consumption by 37.8% with 1.73 K gates of overhead.
A Sub-mW H.264 Baseline-Profile Motion Estimation Processor Core with a VLSI-Oriented Block Partitioning Strategy and SIMD/Systolic-Array Architecture
Junichi MIYAKOSHI Yuichiro MURACHI Tetsuro MATSUNO Masaki HAMAMOTO Takahiro IINUMA Tomokazu ISHIHARA Hiroshi KAWAGUCHI Masayuki MIYAMA Masahiko YOSHIMOTO

PAPER-VLSI Architecture

Vol:
E89-A No:12
Page(s):
3623-3633
We propose a sub-mW H.264 baseline-profile motion estimation processor for portable video applications. It features a VLSI-oriented block partitioning strategy and a low-power SIMD/systolic-array datapath architecture, where the datapath can be switched between an SIMD and a systolic array depending on the processing flow. The processor supports all seven block modes, and can handle three reference frames for CIF (352×288) 30-fps to QCIF (176×144) 15-fps sequences with quarter-pixel accuracy. It integrates 3.3 million transistors, and occupies 2.8×3.1 mm² in a 130-nm CMOS technology. The proposed processor achieves a power of 800 µW for a QCIF 15-fps sequence with one reference picture.
